147 research outputs found

Augmenting PROV with Plans in P-PLAN: Scientific Processes as Linked Data

Author: Garijo Daniel
Gil Yolanda
Publication venue: Facultad de Informática (UPM)
Publication date: 01/01/2012

Provenance models are crucial for describing experimental results in science. The W3C Provenance Working Group has recently released the PROV family of specifications for provenance on the Web. While provenance focuses on what is executed, it is important in science to publish the general methods that describe scientific processes at a more abstract and general level. In this paper, we propose P-PLAN, an extension of PROV to represent the plans that guided the execution and their correspondence to the provenance records that describe the execution itself. We motivate and discuss the use of P-PLAN and PROV to publish scientific workflows as Linked Data.

Archivo Digital UPM

inspect4py : a knowledge extraction framework for Python code repositories

Author: Filgueira Rosa
Garijo Daniel
Publication venue: ACM
Publication date: 24/03/2022

This work presents inspect4py, a static code analysis framework designed to automatically extract the main features, metadata and documentation of Python code repositories. Given an input folder with code, inspect4py uses abstract syntax trees and state-of-the-art tools to find all functions, classes, tests, documentation, call graphs, module dependencies and control flows within all code files in that repository. Using these findings, inspect4py infers different ways of invoking a software component. We have evaluated our framework on 95 annotated repositories, obtaining promising results for software type classification (over 95% F1-score). With inspect4py, we aim to ease the understandability and adoption of software repositories by other researchers and developers.

St Andrews Research Repository
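
The inspect4py abstract above mentions using abstract syntax trees to find functions, classes, documentation and module dependencies. As a rough illustration of that general idea only, the sketch below uses Python's standard `ast` module on an in-memory snippet. It is not inspect4py's implementation or API; the `summarize` function, the `SOURCE` sample and the shape of the returned dictionary are assumptions made up for this example.

```python
import ast

# Hypothetical sample input; a real tool would read files from a repository.
SOURCE = '''
import os

class Greeter:
    """Say hello."""
    def greet(self, name):
        return "hello " + name

def main():
    """Entry point."""
    print(Greeter().greet(os.getcwd()))
'''


def summarize(source: str) -> dict:
    """Collect function names, class names, imports and docstrings from source code."""
    tree = ast.parse(source)
    functions, classes, imports, docstrings = [], [], [], {}
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            functions.append(node.name)
        elif isinstance(node, ast.ClassDef):
            classes.append(node.name)
        elif isinstance(node, ast.Import):
            imports.extend(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            imports.append(node.module)
        else:
            continue
        # Only functions and classes can carry a docstring.
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            doc = ast.get_docstring(node)
            if doc:
                docstrings[node.name] = doc
    return {
        "functions": sorted(functions),
        "classes": sorted(classes),
        "imports": sorted(imports),
        "docstrings": docstrings,
    }


if __name__ == "__main__":
    print(summarize(SOURCE))
```

Call graphs, control flows and test detection, which the abstract also lists, would need considerably more analysis than this walk over the tree.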

A new approach for publishing workflows: abstractions, standards, and linked data

Author: Garijo Verdejo Daniel
Gil Yolanda
Publication venue: Facultad de Informática (UPM)
Publication date: 01/01/2011

In recent years, a variety of systems have been developed that export the workflows used to analyze data and make them part of published articles. We argue that the workflows published with current approaches depend on the specific codes used for execution, the specific workflow system used, and the specific workflow catalogs where they are published. In this paper, we describe a new approach that addresses these shortcomings and makes workflows more reusable through: 1) the use of abstract workflows to complement executable workflows, making them reusable when the execution environment is different, 2) the publication of both abstract and executable workflows using standards such as the Open Provenance Model that can be imported by other workflow systems, 3) the publication of workflows as Linked Data, which results in open, web-accessible workflow repositories. We illustrate this approach using a complex workflow that we re-created from an influential publication that describes the generation of 'drugomes'.

Archivo Digital UPM

Mapping of the Music Ontology to the Media Value Chain Ontology and the PROV Ontology

Author: Garijo Daniel
Rodriguez-Doncel Victor
Publication venue: Facultad de Informática (UPM)
Publication date: 18/12/2012

Mapping of the Music Ontology to the Media Value Chain Ontology and the PROV Ontology

Archivo Digital UPM

Linking Abstract Plans of Scientific Experiments to their Corresponding Execution Traces

Author: Edwards Peter
Garijo Daniel
Markovic Milan
Publication venue: CEUR-WS
Publication date: 21/10/2019

Acknowledgments: The work described in this paper was funded by the award made by the RCUK Digital Economy programme to the University of Aberdeen (EP/N028074/1), a SICSA PECE travel award, the Defense Advanced Research Projects Agency with award W911NF-18-10027, the SIMPLEX program with award W911NF-15-1-0555 and the National Institutes of Health under awards 1U01CA196387 and 1R01GM117097.

Aberdeen University Research

Challenges in Modeling Geospatial Provenance

Author: Garijo Daniel
Gil Yolanda
Harth Andreas
Publication venue: IPAW
Publication date: 01/01/2014

The surge in availability of geospatial data sources, the increased use of crowdsourced maps and the advent of geospatial mashups have brought us to an era where geospatial information is delivered to users after integration from diverse sources. Understanding the provenance of geospatial data is crucial for assessing the quality of the data and deciding whether to trust the information or not. In this paper we describe user requirements for modeling geospatial provenance.

KITopen

PROV-O: The PROV Ontology Tutorial

Author: Garijo Daniel
Publication venue: Facultad de Informática (UPM)
Publication date: 01/01/2013

Provenance is key for describing the evolution of a resource, the entity responsible for its changes and how these changes affect its final state. A proper description of the provenance of a resource shows to whom it is attributed and can help resolve whether it can be trusted or not. This tutorial will provide an overview of the W3C PROV data model and its serialization as an OWL ontology. The tutorial will incrementally explain the features of the PROV data model, from the core starting terms to the most complex concepts. Finally, the tutorial will show the relation between PROV-O and the Dublin Core Metadata terms.

Archivo Digital UPM

Research Objects

Author: Garijo Daniel
Publication venue: Facultad de Informática (UPM)
Publication date: 01/01/2013

In the domain of eScience, investigations are increasingly collaborative. Most scientific and engineering domains benefit from building on top of the outputs of other research, by sharing information to reason over and data to incorporate in the modelling task at hand. This raises the need to provide means for preserving and sharing entire eScience workflows and processes for later reuse. It is necessary to define which information is to be collected, and to create means to preserve it and approaches to enable and validate the re-execution of a preserved process. This includes and goes beyond preserving the data used in the experiments, as the process underlying its creation and use is essential. This tutorial thus provides an introduction to the problem domain and discusses solutions for the curation of eScience processes.

Archivo Digital UPM